Abstract: In this paper, we present a low-complexity joint detection-decoding algorithm for nonbinary LDPC coded-modulation systems. The algorithm iteratively combines hard-decision message-passing decoding with the signal detector. It has low computational complexity, offers good system performance, and converges quickly. Compared to the q-ary sum-product algorithm (QSPA), it is an attractive candidate for practical applications of q-ary LDPC codes.

I. Introduction 

Nonbinary low-density parity-check (LDPC) codes were first introduced by Gallager in [1] based on modulo arithmetic. In [2], Davey and MacKay presented a class of nonbinary LDPC codes defined over the finite field GF(q) with q > 2. They also introduced a sum-product algorithm (SPA) for decoding q-ary LDPC codes, named QSPA. It has since been shown that nonbinary LDPC codes perform better than binary LDPC codes [2], especially when combined with higher-order modulations. Recently, there has been a surge of interest in nonbinary LDPC codes [6].

However, the advantages of nonbinary LDPC codes over their binary counterparts are offset by their higher decoding complexity. To reduce the decoding complexity, Davey and MacKay proposed a more efficient QSPA, called the fast Fourier transform based QSPA (FFT-QSPA), for decoding LDPC codes over GF(2^p) [4]. Moreover, a simplified decoding algorithm called extended min-sum (EMS) was proposed by Declercq and Fossorier in [5] to further reduce decoding complexity. It is a good candidate for decoding q-ary LDPC codes with small q. For larger field sizes (say, q > 32), the sort operations required by the EMS algorithm incur higher complexity. As a result, decoding complexity is still a concern for practical implementation of q-ary LDPC coded systems.

Most recently, Mobini et al. [13] and Huang et al. developed low-complexity reliability-based decoding algorithms for binary LDPC codes. Motivated by their work and [14], this paper explores hard-decision based decoding for q-ary LDPC codes and presents a low-complexity joint detection-decoding algorithm for q-ary LDPC-coded modulation systems, which provides an efficient trade-off between system performance and implementation complexity.

The algorithm is devised to combine the simplicity of hard-decision decoding with the good performance of message-passing algorithms. In the proposed scheme, signal detection and decoding are integrated as a whole, and the input signal vector to the detector is updated iteratively. At each iteration, the updated hard-decision results from the detector are delivered to the LDPC decoder, which performs hard-decision decoding using a message-passing algorithm. The output of the decoder is then fed back to the detector, where the received signal points are updated so that they move progressively closer to the transmitted signals in observation space. This can be viewed as an iterative denoising process. Compared to the FFT-QSPA, the proposed algorithm requires lower computational complexity and converges quickly.

II. Nonbinary LDPC-Coded Modulations 
A. System Model 

The nonbinary LDPC-coded modulation system under consideration is shown in Fig. 1. Assume that an LDPC code $\mathcal{C}[N, K]$ over GF(q) with q > 2 is used in conjunction with a two-dimensional signal constellation $\mathcal{X}$ of size $|\mathcal{X}|$. The input vector of information symbols, $\mathbf{u} \in \mathrm{GF}(q)^K$, is first encoded by the LDPC encoder into a codeword $\mathbf{v} = (v_0, v_1, \ldots, v_{N-1}) \in \mathcal{C}$. The corresponding code rate is $R_c = K/N$. The codeword $\mathbf{v}$ is then mapped to $\mathcal{X}$, producing the modulated signal vector $\mathbf{x} = (x_0, x_1, \ldots, x_{N-1})$ with $x_j = \mathcal{M}(v_j) \in \mathcal{X}$, where $\mathcal{M}(\cdot)$ stands for the signal mapping function. In this paper, we always assume that the constellation size equals the finite field size, i.e., $|\mathcal{X}| = q$. The spectral efficiency of this coded-modulation system is

$$R_c \log_2 |\mathcal{X}| \ \text{bits/signal}. \tag{1}$$



Suppose that the complex signal vector $\mathbf{x}$ is transmitted over the AWGN channel. The received vector $\mathbf{y} = (y_0, y_1, \ldots, y_{N-1})$ is then given by

$$y_j = x_j + n_j, \quad j = 0, 1, \ldots, N-1, \tag{2}$$
Fig. 1. A nonbinary LDPC-coded modulation system



where $n_j \sim \mathcal{CN}(0, N_0)$ are independent and identically distributed complex Gaussian random variables with zero mean and variance $N_0/2$ per dimension. Denote by $E_s = E[|x_j|^2]$ the average energy per transmitted signal. Then the average received signal-to-noise ratio (SNR) is

$$\mathrm{SNR} = E_s/N_0 = R_c \log_2 |\mathcal{X}| \cdot E_b/N_0,$$

where $E_b$ denotes the average energy per information bit and the factor $R_c \log_2 |\mathcal{X}|$ is the spectral efficiency in (1).
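As a quick numerical check of (1) and the SNR relation, the following minimal sketch converts a target $E_b/N_0$ in dB into the per-dimension noise variance $N_0/2$ used to simulate the channel. The function names and the default normalization $E_s = 1$ are assumptions made for illustration; they do not come from the paper.

```python
import math

def spectral_efficiency(n, k, constellation_size):
    """Bits per transmitted signal, R_c * log2|X|, as in (1)."""
    return (k / n) * math.log2(constellation_size)

def noise_variance_per_dim(ebn0_db, n, k, constellation_size, es=1.0):
    """N0/2 for a given Eb/N0 in dB, assuming average signal energy `es`."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    # Es/N0 = (spectral efficiency) * Eb/N0
    esn0 = spectral_efficiency(n, k, constellation_size) * ebn0
    n0 = es / esn0
    return n0 / 2.0

# Parameters of the 16-ary (255, 175) code with 16-QAM at 8.0 dB.
eta = spectral_efficiency(255, 175, 16)
sigma2 = noise_variance_per_dim(8.0, 255, 175, 16)
```

For the 16-ary (255, 175) code with 16-QAM, the spectral efficiency evaluates to 700/255, roughly 2.75 bits/signal.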

In this paper, we consider a hard-decision based iterative detection-decoding strategy. In each iteration, the signal detector makes a hard decision about $\mathbf{v}$ based on the updated received vector $\mathbf{y}$, producing the vector $\mathbf{z} = (z_0, z_1, \ldots, z_{N-1})$ with $z_j \in \mathrm{GF}(q)$; the decoder then performs hard-decision decoding with $\mathbf{z}$ as the input. The hard extrinsic information produced by the decoder is then fed back to the signal detector to update $\mathbf{y}$. We will show that with this decoding strategy, good system performance can be achieved with reduced complexity.



B. LDPC Codes over GF(q) 

A q-ary LDPC code $\mathcal{C}$ of length N over GF(q) is given by the null space of a sparse $M \times N$ parity-check matrix $\mathbf{H} = [h_{i,j}]$ over GF(q), where M is the number of check equations and $M = N - K$ if $\mathbf{H}$ is full rank. Let $\mathbf{v} = (v_0, v_1, \ldots, v_{N-1})$ be a codeword in $\mathcal{C}$. Then the parity-check constraints can be expressed as $\mathbf{v}\mathbf{H}^T = \mathbf{0}$, or

$$\sum_{j=0}^{N-1} h_{i,j} v_j = 0, \quad i = 0, 1, \ldots, M-1, \tag{3}$$

where the operations of multiplication and addition are all defined over GF(q). If the matrix $\mathbf{H}$ has constant row weight $d_c$ and constant column weight $d_v$, then the corresponding code is called a $(d_v, d_c)$-regular q-ary LDPC code.

Similar to its binary counterpart, a q-ary LDPC code can also be described using a Forney-style factor graph [16], as depicted in Fig. 2, where $\mathcal{M}$ denotes the mapper/demapper. The graph has N variable nodes corresponding to coded symbols, and M check nodes corresponding to parity-check equations. For convenience, we define the following two index sets:

$$\mathcal{N}_v(j) = \{\, i \mid h_{i,j} \neq 0,\ 0 \le i < M \,\}, \tag{4}$$

$$\mathcal{N}_c(i) = \{\, j \mid h_{i,j} \neq 0,\ 0 \le j < N \,\}. \tag{5}$$
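The index sets in (4) and (5) can be read directly off the parity-check matrix. The sketch below is illustrative only: the small matrix `H`, with GF(4) elements labelled 0 to 3, and the helper names are hypothetical and not taken from the paper.

```python
def index_sets(H):
    """Return (N_v, N_c) for a parity-check matrix given as a list of rows."""
    m, n = len(H), len(H[0])
    # N_v[j]: check nodes i with a nonzero entry in column j, as in (4).
    n_v = [[i for i in range(m) if H[i][j] != 0] for j in range(n)]
    # N_c[i]: variable nodes j with a nonzero entry in row i, as in (5).
    n_c = [[j for j in range(n) if H[i][j] != 0] for i in range(m)]
    return n_v, n_c

def is_regular(H):
    """True if all column weights are equal and all row weights are equal."""
    n_v, n_c = index_sets(H)
    col_weights = {len(s) for s in n_v}
    row_weights = {len(s) for s in n_c}
    return len(col_weights) == 1 and len(row_weights) == 1

# Hypothetical 4 x 4 matrix with row and column weight 2.
H = [[1, 2, 0, 0],
     [0, 0, 3, 1],
     [2, 0, 1, 0],
     [0, 1, 0, 3]]
n_v, n_c = index_sets(H)
```

Only the positions of the nonzero entries matter here; the field arithmetic enters when the check-sums themselves are computed.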




Fig. 2. Forney-style factor graph of the nonbinary LDPC-coded system 



III. Joint Detection-Decoding Algorithm for q-ary LDPC-Coded Systems

In this section, we develop an iterative joint detection-decoding algorithm for the LDPC-coded modulation system shown in Fig. 1. The whole algorithm is a hard-decision based message-passing algorithm (MPA) operating on the factor graph. Assume that a $(d_v, d_c)$-regular q-ary LDPC code is used.

A. Signal Detection and Message-Passing Based on Hard 
Information 

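Let $\hat{\mathbf{x}} = (\hat{x}_0, \hat{x}_1, \ldots, \hat{x}_{N-1})$ with $\hat{x}_j \in \mathcal{X}$ denote the estimate of $\mathbf{x}$ made by the signal detector based on the received vector $\mathbf{y}$. Assume that all the constellation points in $\mathcal{X}$ are equally likely. Then with the maximum likelihood decision rule, the detected signal $\hat{\mathbf{x}}$ is given by

$$\hat{x}_j = \arg\min_{x \in \mathcal{X}} \| y_j - x \|, \quad j = 0, 1, \ldots, N-1, \tag{6}$$

where $\| \cdot \|$ denotes the Euclidean ($\ell_2$) norm. The input of the decoder $\mathbf{z}$ is simply the demapping output of $\hat{\mathbf{x}}$, i.e.,

$$z_j = \mathcal{M}^{-1}(\hat{x}_j) \in \mathrm{GF}(q), \quad j = 0, 1, \ldots, N-1. \tag{7}$$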
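A minimal sketch of the detection and demapping steps in (6) and (7) is given below. It assumes, purely for illustration, that field elements are labelled by the integers 0 to q-1 and that symbol `a` maps to `constellation[a]`; the 4-point constellation is a hypothetical example rather than one used in the paper.

```python
def ml_detect(y, constellation):
    """Hard ML decisions: nearest constellation point and its symbol label."""
    x_hat, z = [], []
    for sample in y:
        # Label of the constellation point closest in Euclidean distance (6).
        best = min(range(len(constellation)),
                   key=lambda a: abs(sample - constellation[a]))
        z.append(best)                      # demapped symbol, as in (7)
        x_hat.append(constellation[best])   # detected signal point
    return x_hat, z

# Hypothetical 4-point constellation; symbol a maps to qpsk[a].
qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
x_hat, z = ml_detect([0.9 + 1.2j, -0.2 - 0.7j], qpsk)
```

Here the two noisy samples are detected as the points labelled 0 and 2, respectively.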

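The vector $\mathbf{z}$ is a codeword if and only if its M-tuple syndrome $\mathbf{s} = (s_0, s_1, \ldots, s_{M-1})$ equals $\mathbf{0}$, i.e.,

$$\mathbf{s} = \mathbf{z}\mathbf{H}^T = \mathbf{0}. \tag{8}$$

For $0 \le i < M$, the component $s_i$ of $\mathbf{s}$ is given by

$$s_i = \sum_{j \in \mathcal{N}_c(i)} h_{i,j} z_j, \tag{9}$$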



which is called a check-sum of received symbols. A received symbol $z_j$ is said to be checked by $s_i$ if $h_{i,j} \in \mathrm{GF}(q) \setminus \{0\}$. From (9) with $s_i = 0$, the estimate of $v_j$ given by the other variable nodes participating in the check-sum $s_i$ can be expressed as

$$\sigma_{i,j} = -h_{i,j}^{-1} \Big( \sum_{j' \in \mathcal{N}_c(i) \setminus j} h_{i,j'} z_{j'} \Big). \tag{10}$$

Over fields of characteristic 2, the minus sign has no effect. This is the update rule for the check-to-variable node message. Since the column weight of $\mathbf{H}$ is $d_v$, every symbol $v_j$ receives $d_v$ estimates, one along each edge connecting it to a check node in $\mathcal{N}_v(j)$, as depicted in Fig. 2. As can be seen from (9) and (10), only hard information propagates between variable and check nodes, i.e., the variable nodes send their hard-decision decoded symbols to the check nodes, and the check nodes simply compute the syndromes and send estimates back to their adjacent variable nodes. The estimate $\sigma_{i,j}$ for $v_j$ can be considered as extrinsic information [15].
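The check-node update in (10) can be sketched for fields of the form GF(2^p), where addition and subtraction are both bitwise XOR. The representation of field elements as integers, the default field polynomial $x^2 + x + 1$ for GF(4), and all function names are assumptions made for this illustration; a practical decoder would more likely use precomputed lookup tables.

```python
def gf_mul(a, b, p=2, poly=0b111):
    """Multiply in GF(2^p); elements are integers, `poly` is the field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << p):   # degree reached p: reduce modulo the field polynomial
            a ^= poly
    return result

def gf_inv(a, p=2, poly=0b111):
    """Brute-force inverse of a nonzero element (adequate for small fields)."""
    for c in range(1, 1 << p):
        if gf_mul(a, c, p, poly) == 1:
            return c
    raise ZeroDivisionError("0 has no inverse")

def check_to_variable(H, z, p=2, poly=0b111):
    """sigma[(i, j)] as in (10) for every edge (i, j) of the factor graph."""
    sigma = {}
    for i, row in enumerate(H):
        edges = [j for j, h in enumerate(row) if h != 0]
        # Full check-sum s_i as in (9); addition in GF(2^p) is XOR.
        s_i = 0
        for j in edges:
            s_i ^= gf_mul(row[j], z[j], p, poly)
        for j in edges:
            # Remove the contribution of z_j itself (subtraction is also XOR).
            partial = s_i ^ gf_mul(row[j], z[j], p, poly)
            sigma[(i, j)] = gf_mul(gf_inv(row[j], p, poly), partial, p, poly)
    return sigma

# Hypothetical single check over GF(4): v0 + 2*v1 + 3*v2 = 0.
H = [[1, 2, 3]]
sigma_ok = check_to_variable(H, [1, 1, 1])    # (1, 1, 1) satisfies the check
sigma_err = check_to_variable(H, [2, 1, 1])   # first symbol received in error
```

When the decisions form a codeword, every estimate reproduces the received symbol. When the first symbol is in error, the estimate returned on its own edge is still the correct value, because it is computed only from the other participants of the check.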

B. Iterative Detection-Decoding 

We now proceed to consider the update rule for variable nodes. Refer to Fig. 2. Assume that a variable node has received the extrinsic information from its adjacent check nodes. Unlike general message-passing algorithms, we use this information to update the received samples in order to improve the reliability of the received signal. To do this, an iterative process is performed between the variable nodes and the signal detector. For simplicity, the proposed iterative joint detection-decoding algorithm will be referred to as IJDD hereafter.

In the following, we first introduce some notation used for the IJDD algorithm. Let $k_{\max}$ be the maximum number of iterations to be performed. For $0 \le k < k_{\max}$, let:

• $\mathbf{y}^{(k)} = (y_0^{(k)}, y_1^{(k)}, \ldots, y_{N-1}^{(k)})$ be the input vector to the signal detector in the kth iteration; $\hat{\mathbf{x}}^{(k)}$ and $\mathbf{z}^{(k)}$ be the corresponding detected signal vector and the output decision vector.

• $\mathbf{s}^{(k)}$ be the syndrome of $\mathbf{z}^{(k)}$.

• $\sigma_{i,j}^{(k)}$ be the extrinsic information for $v_j$ given by the ith check-sum involving $v_j$ in the kth iteration.

• $\mathcal{D}(y_j^{(k)}, r)$ denote a valid search sphere of radius r centered at $y_j^{(k)}$.

• $L(p, q) = q - p$ be a correction vector directed from the point p to the point q. For brevity, we will use $L_j^{(k)}$ for $L(p, q)$.

• $f_j^{(k)}(a)$, $a \in \mathrm{GF}(q)$, denote the number of occurrences of the element a in $\{\sigma_{i,j}^{(k)}\}_{i \in \mathcal{N}_v(j)}$. Clearly, $0 \le f_j^{(k)}(a) \le d_v$ and $\sum_{a \in \mathrm{GF}(q)} f_j^{(k)}(a) = d_v$. $f_j^{(k)}(a)$ indicates a reliability measure for decoding $z_j$ into the symbol a. Let

$$a_{\max} = \arg\max_{a \in \mathrm{GF}(q)} \{ f_j^{(k)}(a) \}$$

and

$$\Delta f_j^{(k)} = f_j^{(k)}(a_{\max}) - \max_{a \neq a_{\max}} \{ f_j^{(k)}(a) \},$$

where $a_{\max}$ is the element in GF(q) that has the highest reliability for $v_j$, and $\Delta f_j^{(k)}$ represents the difference in number of votes between the two highest-voted candidates for $v_j$ in the kth iteration. With the plurality voting rule, we choose

$$\hat{v}_j^{(k)} = a_{\max}. \tag{11}$$

With the above discussions, the message update rule for variable nodes can be formulated as follows.

Message Update Rule for Variable Nodes: Make an estimate $\hat{v}_j^{(k)}$ using (11) based on $\{\sigma_{i,j}^{(k)}\}$, $j = 0, 1, \ldots, N-1$, and evaluate $f_j^{(k)}(a_{\max})$ and $\Delta f_j^{(k)}$. Then the variable nodes send the triples $(\hat{v}_j^{(k)}, \Delta f_j^{(k)}, f_j^{(k)}(a_{\max}))$, $0 \le j \le N-1$, to the detector, where the following operation is performed on $\mathbf{y}^{(k)}$:

$$y_j^{(k+1)} = y_j^{(k)} + \xi_j^{(k)} L_j^{(k)}. \tag{12}$$

Here the correction vector $L_j^{(k)}$ and the scaling factor $\xi_j^{(k)}$ are determined from $\hat{x}_j^{(k)}$ and $(\hat{v}_j^{(k)}, \Delta f_j^{(k)}, f_j^{(k)}(a_{\max}))$, $0 \le j \le N-1$, as follows. If $\mathcal{M}(\hat{v}_j^{(k)}) \in \mathcal{D}(y_j^{(k)}, r)$, then $L_j^{(k)}$ is given by (13), which treats the case $\mathcal{M}(\hat{v}_j^{(k)}) = \hat{x}_j^{(k)}$ separately from the case $\mathcal{M}(\hat{v}_j^{(k)}) \neq \hat{x}_j^{(k)}$, and $\xi_j^{(k)}$ is given by (14), which involves the ratio $f_j^{(k)}(a_{\max})/d_v$ and a comparison of $\Delta f_j^{(k)}$ with a threshold T. Otherwise set $L_j^{(k)} = 0$ and $\xi_j^{(k)} = 0$.

[Equations (13) and (14), and the original symbol for the scaling factor, are illegible in the source; $\xi_j^{(k)}$ is used here.]

Note that $\mathcal{D}(y_j^{(k)}, r)$ specifies a region where $x_j$ is most likely located. Only estimates $\mathcal{M}(\hat{v}_j^{(k)})$ within this region are used to update $y_j^{(k)}$. In this paper we set r to be $1.415\, d_{\min}$ and T to be 3, where $d_{\min}$ is the minimum Euclidean distance among constellation points. Moreover, the case of $\mathcal{M}(\hat{v}_j^{(k)}) = \hat{x}_j^{(k)}$ in (13) indicates that $y_j^{(k)}$ may be decoded into $\hat{x}_j^{(k)}$ with high confidence. In other words, $y_j^{(k)}$ is considered as the noisy version of $\hat{x}_j^{(k)}$. However, instead of instantly setting $y_j^{(k+1)}$ to be $\hat{x}_j^{(k)}$, a cautious shift is applied to $y_j^{(k)}$ towards $\hat{x}_j^{(k)}$. In the case of $\mathcal{M}(\hat{v}_j^{(k)}) \neq \hat{x}_j^{(k)}$, the received signal is instead shifted towards the decision boundary of the two candidates, achieving a trade-off between the two choices. Here the decision boundary of $\mathcal{M}(\hat{v}_j^{(k)})$ and $\hat{x}_j^{(k)}$ is the bisector perpendicular to $\mathcal{M}(\hat{v}_j^{(k)}) - \hat{x}_j^{(k)}$.

Then, using $L_j^{(k)}$ and $\xi_j^{(k)}$, the detector updates the current input vector according to (12). Based on $\mathbf{y}^{(k+1)}$, a new hard decision is made, and the results are delivered to the variable nodes as the updated message passed from the detector to the decoder. The whole working process of the IJDD algorithm is shown in Fig. 3.
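The vote counting behind (11) can be sketched as follows. The function name is hypothetical, and ties are broken in favour of the element seen first, which is an assumption of this sketch since the text does not state a tie-breaking rule.

```python
from collections import Counter

def plurality_vote(estimates):
    """Return (a_max, f(a_max), delta_f) for the d_v estimates of one symbol."""
    counts = Counter(estimates)      # f_j(a) for every element a that occurs
    ranked = counts.most_common()    # sorted by decreasing count
    a_max, f_max = ranked[0]
    # Runner-up count; elements of GF(q) that never occur have f_j(a) = 0.
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    return a_max, f_max, f_max - runner_up

# Six hypothetical estimates for one variable node with d_v = 6.
a_max, f_max, delta_f = plurality_vote([3, 3, 7, 3, 1, 7])
```

In this example the element 3 collects three of the six votes and leads the runner-up by one vote, so the triple sent to the detector would carry a small vote margin.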

In summary, the IJDD algorithm can be formulated as 
follows. 
Fig. 3. Iterative joint detection-decoder.



• Initialization.

Set $k = 0$, and $\mathbf{y}^{(0)} = \mathbf{y}$.

• Repeat the following while $k < k_{\max}$

1) Signal detection:

- For $j = 0, 1, \ldots, N-1$, make ML decisions based on $y_j^{(k)}$, yielding $\hat{x}_j^{(k)}$ and $z_j^{(k)}$ as in (6) and (7).

- Pass the message $\mathbf{z}^{(k)}$ to the LDPC decoder.

2) Compute syndrome and do check-node update:

- Compute the syndrome vector $\mathbf{s}^{(k)} = \mathbf{z}^{(k)} \mathbf{H}^T$.

- If $\mathbf{s}^{(k)} = \mathbf{0}$, then terminate the iteration and output $\mathbf{z}^{(k)}$ as the decoded codeword;

- else for $i = 0$ to $M-1$ and $j = 0$ to $N-1$, compute the check-to-variable messages $\sigma_{i,j}^{(k)}$ as in (10).

3) Variable-node update and correction:

For $j = 0$ to $N-1$,

- evaluate $(\hat{v}_j^{(k)}, \Delta f_j^{(k)}, f_j^{(k)}(a_{\max}))$ based on $\{\sigma_{i,j}^{(k)}\}$ at the variable nodes;

- send the message triples $(\hat{v}_j^{(k)}, \Delta f_j^{(k)}, f_j^{(k)}(a_{\max}))$ to the detector and perform signal corrections as done in Section III-B.

4) $k = k + 1$, entering the next iteration.

• If $k = k_{\max}$, declare a decoding failure.
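The control flow of the steps above can be sketched as a small driver loop. The four callables `detect`, `syndrome`, `check_messages`, and `correct` are hypothetical placeholders for steps 1) to 3); in particular, `correct` stands in for the signal correction of Section III-B, whose exact rules are not reproduced here.

```python
def ijdd(y, k_max, detect, syndrome, check_messages, correct):
    """Run at most k_max detection-decoding iterations.

    Returns (z, k) on success, where z is the decoded word and k the
    iteration at which the syndrome vanished, or (None, k_max) on failure.
    """
    y_k = list(y)                        # y^(0) = y
    for k in range(k_max):
        x_hat, z = detect(y_k)           # step 1: hard ML decisions
        if not any(syndrome(z)):         # step 2: all check-sums are zero
            return z, k
        sigma = check_messages(z)        # step 2: check-to-variable messages
        y_k = correct(y_k, x_hat, sigma) # step 3: vote and shift the samples
    return None, k_max                   # decoding failure
```

Passing the processing stages as arguments keeps the loop independent of the field arithmetic and of the particular correction rule, so each stage can be tested in isolation.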

It is worth mentioning that the IJDD algorithm differs from QSPA/FFT-QSPA in two aspects: 1) in the IJDD algorithm, only simple operations such as additions, comparisons, table look-ups, finite field operations, and a negligible amount of real operations are required; 2) the iterative process based on hard information is performed among the detector/demapper, variable nodes, and check nodes, while in the QSPA/FFT-QSPA, soft information propagates only between variable and check nodes.

From the above, it can be seen that the IJDD algorithm is easy to implement and can achieve high speed. However, as with existing reliability-based decoding algorithms, the column weights of $\mathbf{H}$ have to be relatively large to ensure the reliability of majority voting. This requirement does not seem easy to fulfill for randomly constructed q-ary LDPC codes, so application of the IJDD algorithm is restricted to q-ary LDPC codes constructed based on finite fields [10] or finite geometries [11].

IV. Simulation Results 

In this section, two examples of q-ary LDPC codes are provided to demonstrate the effectiveness of the proposed IJDD algorithm.

Example 1: Consider a 16-ary (255, 175) regular LDPC code constructed based on finite fields. The factor graph of this code has 255 variable nodes and 255 check nodes (including 175 redundant check equations). Both the row and column weights are 16. With 16-QAM signaling over the AWGN channel, scatter plots of the signal vectors before and after decoding using IJDD with 10 iterations at $E_b/N_0 = 8.0$ dB are shown in Fig. 4(a) and Fig. 4(b), respectively. Clearly, the scatter plot in Fig. 4(b) corresponds to an ideal signal constellation, i.e., all received samples have been shifted to the most probable transmitted constellation points.

Fig. 4. Scatter plot of signal space before and after decoding with 10 iterations: (a) received signal space; (b) corrected signal space.

Shown in Fig. 5 are the symbol and word error performances of this coded system decoded using the IJDD algorithm and the FFT-QSPA with 50 iterations. It is seen that at an SER of $10^{-6}$, the IJDD algorithm performs only 0.67 dB away from the FFT-QSPA. A similar observation can be made for the WER.

To illustrate the rate of decoding convergence of the IJDD algorithm, simulations were also carried out for $k_{\max} = 10$ and $k_{\max} = 5$, with results shown in Fig. 6. It is seen that with 10 and 50 iterations, the BER curves of the IJDD algorithm nearly overlap. Even with 5 iterations, the loss of performance is only 0.35 dB compared to IJDD with 50 iterations.

Example 2: Consider a 32-ary (1023, 781) regular LDPC code constructed based on finite fields. The row redundancy is 781 and both the row and column weights are 32. The performance of this code with 32-QAM modulation over the AWGN channel is illustrated in Fig. 7, where the codewords were obtained by encoding randomly generated source data. Surprisingly, the IJDD algorithm outperforms the FFT-QSPA by 1.0 dB at a BER of $10^{-5}$. This result may be caused by the large column weights and large row redundancy of the code, which can offer high reliability for IJDD. In addition, the large column weights and large row redundancy of the code may make the FFT-QSPA unsuitable for decoding it.

Fig. 5. Error performance of the 16-ary (255,175) LDPC code decoded with the FFT-QSPA and the IJDD algorithm (16-QAM)

Fig. 6. Rate of decoding convergence of the IJDD algorithm for decoding the 16-ary (255,175) LDPC code (16-QAM)

V. Conclusions 

In this paper, we propose an iterative joint detection-decoding algorithm for q-ary LDPC-coded systems, which can be characterized as a hard-decision based message-passing algorithm and therefore has low computational complexity. For q-ary LDPC codes with large row redundancy and column weights, the proposed algorithm offers good performance, and can even outperform the FFT-QSPA, at lower computational complexity. Furthermore, its fast decoding convergence makes it an attractive candidate for practical applications of q-ary LDPC codes. Although only regular LDPC codes are considered here, the proposed algorithm can be extended directly to decode irregular q-ary LDPC codes.